Three essays on the evaluation of informality in the Chilean labor market using high-dimensional methods

Dissertation Defense

Cynthia Cáceres

PhD Student
Universidad de Talca

Andrés Riquelme

Supervisor
Universidad de Talca

November 10, 2025

Motivation

Analyzing informality in the Chilean labor market is complex.

We explore informality from a multidimensional perspective in three stages using high-dimensional statistical tools:

  1. Decision to work informally.
  2. Monetary gain or loss from working formally or informally.
  3. Monetary gain or loss from working formally or informally by status in employment.

Chapter 1: Understanding informal employment in Chile

Are there nonlinear and interactive effects between individual and household characteristics that influence the decision to participate in informal employment?

Introduction

Informal employment rate in Chile: 27.5%.

Microeconomic factors:

Definition of Informal Worker: ILO Definition

  • Unpaid family workers: all.
  • Employees: not paid social security contributions or health insurance.
  • Employers: firm has five or fewer employees nationwide.
  • Own-account workers: their occupation is not classified as “Members of the executive and legislative bodies and managerial personnel of the public administration and companies”, “Scientific and intellectual professional”, or “Technicians and mid-level professionals”.

Methodology: Probit model

\[ \begin{equation*} \mathrm{P} \left(y_{i} = 1 \right) = \Phi \left( \boldsymbol{x}_{i}^{\intercal} \boldsymbol{\beta}\right) \end{equation*} \]

where,

  • \(y_{i}\) represents informal employment participation, equal to one if individual \(i\) is an informal worker and zero otherwise.
  • \(\boldsymbol{x}_{i}\) is an individual characteristic vector selected by adaptive lasso.
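As a minimal sketch of how the probit probability above is evaluated for one individual, the following uses only the Python standard library. The function names, the characteristic vector, and the coefficient values are hypothetical and chosen for illustration; they are not estimates from the thesis.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(x: list[float], beta: list[float]) -> float:
    """Return P(y = 1 | x) = Phi(x' beta) for a single individual."""
    if len(x) != len(beta):
        raise ValueError("x and beta must have the same length")
    index = sum(xj * bj for xj, bj in zip(x, beta))  # linear index x' beta
    return std_normal_cdf(index)


# Hypothetical inputs: an intercept plus two characteristics.
x_i = [1.0, 0.5, 2.0]
# A zero coefficient mimics a variable that adaptive lasso did not select.
beta = [-0.2, 0.4, 0.0]

p_informal = probit_probability(x_i, beta)
print(f"P(informal) = {p_informal:.3f}")
```

In this toy case the linear index is zero, so the predicted probability of informal work is one half.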

Results: Number of controls selected by ADAPTIVE LASSO

| Variables selected (\(\hat{\beta}_{ALASSO,j} \neq 0\)) | Full sample | Men | Women |
|---|---|---|---|
| Linear (#) | 6 | 4 | 2 |
| Interactive (#) | 648 | 583 | 312 |
| Total | 654 | 587 | 314 |
| IMR selected | Yes | Yes | Yes |
| RMSE | 0.117 | 0.109 | 0.129 |

Notes: Interactive variables correspond to nonlinear and interaction variables. IMR refers to the Inverse Mills Ratio term, included as a possible control for selection bias.
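The slides do not state how the Inverse Mills Ratio is computed. A common choice, assumed here, is the Heckman-style form \(\lambda(z) = \phi(z)/\Phi(z)\) evaluated at a first-stage probit index \(z\). The sketch below only illustrates that formula; the function names are hypothetical.

```python
import math


def std_normal_pdf(z: float) -> float:
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z: float) -> float:
    """Assumed form: lambda(z) = phi(z) / Phi(z) for the selected sample."""
    return std_normal_pdf(z) / std_normal_cdf(z)


# The ratio falls as the selection index rises, i.e. as selection becomes
# less of a concern for that observation.
imr_values = [inverse_mills_ratio(z) for z in (-1.0, 0.0, 1.0)]
```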

Full Sample

Increase probability of working informally:

  • Annulled \(\times\) Illiteracy
  • Annulled \(\times\) Other community service activities
  • Annulled \(\times\) Manufacturing Industries

Decrease probability of working informally:

  • Annulled \(\times\) Large enterprise
  • Annulled \(\times\) Number of children under six years
  • Annulled \(\times\) Teaching

Conclusions

The decision to work informally depends not only on individual-level characteristics in linear form but also on their interactions.

Cross-category interactions consistently decrease the probability of working informally:

  • Contract Type \(\times\) Administrative Zone (All, men, and women).
  • Contract Type \(\times\) Firm Size (All, men, and women).
  • Contract Type \(\times\) Nationality (Men).
  • Firm Size \(\times\) Administrative Zone (Women).
  • Contract Type \(\times\) Sexual Orientation (Women).

Marital Status \(\times\) Contract Type increases the probability of working informally for women.

Policy interventions should take into account the heterogeneity of subgroups to improve the distributive efficiency of public resources.

Chapter 2: Wage gap determinants between formal and informal workers and wage structure in Chile

What are the main determinants of the wage structure of formal and informal workers in Chile, and which factors contribute to the wage gap between these two groups?

Motivation and Literature Review

Raw wage gap between formal and informal workers: 60.3%

Wage premium for formal workers

Wage penalty for informal workers

Wage gap between formal and informal workers

Informal worker definition following ILO definition

  • Unpaid family workers: all.
  • Employees: not paid social security contributions or health insurance.
  • Employers: firm has five or fewer employees nationwide.
  • Own-account workers: their occupation is not classified as “Members of the executive and legislative bodies and managerial personnel of the public administration and companies”, “Scientific and intellectual professional”, or “Technicians and mid-level professionals”.

Methodology: ADAPTIVE LASSO

We follow the Adaptive Least Absolute Shrinkage and Selection Operator (adaptive lasso) methodology proposed by Zou (2006).

\[ \begin{equation*} \boldsymbol{\hat{\beta}}^{*(n)}_{\mathsf{ALASSO}} = \operatorname*{arg\,min}_{\{\beta_{1},\dots,\beta_{K}\}} \Bigg\| \boldsymbol{y}-\sum_{j=1}^{K}\boldsymbol{x}_{j}\beta_{j}\Bigg\|^{2}+\lambda\sum_{j=1}^{K}\hat{w}_{j}|\beta_{j}|, \end{equation*} \]

where \(\lambda\) is a non-negative regularization parameter selected using cross-validation, and \(\boldsymbol{\hat{w}}=\{\hat{w}_{1},\dots,\hat{w}_{K}\}\) is a vector of weights defined elementwise as \(\hat{w}_{j}= \frac{1}{|\hat{\beta}_{\mathsf{RIDGE},j}|^{\gamma}}\), with \(\gamma = 1\).

Our relevant subset of predictors \(A^{*}_{n}\) corresponds to the non-zero parameters for formal and/or informal workers: \(A^{*}_{n}=\{j:\hat{\beta}^{F}_{\mathsf{ALASSO}, j}\neq 0 \; \vee \; \hat{\beta}^{I}_{\mathsf{ALASSO}, j}\neq 0 \}\).
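The objective above can be illustrated with a small NumPy sketch: ridge coefficients give the weights \(\hat{w}_{j}\), and coordinate descent with soft-thresholding minimizes the weighted-penalty criterion. This is an illustration under stated assumptions, not the thesis code: the solver, the ridge penalty `ridge_alpha`, the fixed value of `lam` (the slides select \(\lambda\) by cross-validation), and the synthetic data are all hypothetical.

```python
import numpy as np


def ridge(X, y, alpha=1.0):
    """Closed-form ridge estimate used only to build the adaptive weights."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(k), X.T @ y)


def soft_threshold(rho, t):
    """Soft-thresholding operator S(rho, t)."""
    return np.sign(rho) * max(abs(rho) - t, 0.0)


def adaptive_lasso(X, y, lam, gamma=1.0, ridge_alpha=1.0, n_iter=500, tol=1e-10):
    """Minimize ||y - X b||^2 + lam * sum_j w_j |b_j| by coordinate descent."""
    k = X.shape[1]
    # w_j = 1 / |beta_ridge_j|^gamma; the tiny constant avoids division by zero.
    w = 1.0 / (np.abs(ridge(X, y, ridge_alpha)) ** gamma + 1e-12)
    beta = np.zeros(k)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(k):
            if col_sq[j] == 0.0:
                continue
            # Partial residual that excludes the contribution of column j.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j
            beta[j] = soft_threshold(rho, lam * w[j] / 2.0) / col_sq[j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta, w


# Synthetic data: only columns 0 and 2 truly matter.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
true_beta = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
y = X @ true_beta + 0.1 * rng.standard_normal(200)

beta_hat, weights = adaptive_lasso(X, y, lam=5.0)
selected = np.flatnonzero(beta_hat)  # indices with non-zero coefficients
print("selected columns:", selected.tolist())
```

Columns with small ridge coefficients get large weights and are penalized heavily, which is what pushes irrelevant predictors to exactly zero while leaving strong predictors almost unshrunk.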

The Sparse Matrix

\[ \underset{(n_F+n_I, K)}{\mathbf{X}} = \left( \begin{array}{cccc|cccc|c} x_{11}^{\text{F}} & x_{12}^{\text{F}} & \cdots & x_{1k}^{\text{F}} & 0 & 0 &\cdots & 0 & IMR^{F}_{1} \\ x_{21}^{\text{F}} & x_{22}^{\text{F}} & \cdots & x_{2k}^{\text{F}} & 0 & 0 &\cdots & 0 & IMR^{F}_{2} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots &\ddots & \vdots & \vdots \\ x_{n_{F},1}^{\text{F}} & x_{n_F,2}^{\text{F}} & \cdots & x_{n_F,k}^{\text{F}} & 0 & 0 &\cdots & 0 & IMR^{F}_{n_{F}} \\ \hline 0 & 0 & \cdots & 0 & x_{11}^{\text{I}} & x_{12}^{\text{I}} & \cdots & x_{1k}^{\text{I}} & IMR^{I}_{1} \\ 0 & 0 & \cdots & 0 & x_{21}^{\text{I}} & x_{22}^{\text{I}} & \cdots & x_{2k}^{\text{I}} & IMR^{I}_{2} \\ \vdots & \vdots& \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & x_{n_I,1}^{\text{I}}& x_{n_I,2}^{\text{I}}& \cdots & x_{n_I,k}^{\text{I}} & IMR^{I}_{n_{I}} \\ \end{array} \right) \]
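A minimal NumPy sketch of how a block matrix with this layout could be assembled: formal-worker regressors fill the left block, informal-worker regressors fill the right block, and a shared final column stacks the Inverse Mills Ratio terms. The function name and the toy arrays are hypothetical.

```python
import numpy as np


def build_block_design(X_formal, X_informal, imr_formal, imr_informal):
    """Stack [X^F, 0, IMR^F] over [0, X^I, IMR^I] into one design matrix."""
    n_f, k = X_formal.shape
    n_i, k_i = X_informal.shape
    if k != k_i:
        raise ValueError("both groups must share the same k regressors")
    top = np.hstack([X_formal, np.zeros((n_f, k)), imr_formal.reshape(-1, 1)])
    bottom = np.hstack([np.zeros((n_i, k)), X_informal, imr_informal.reshape(-1, 1)])
    return np.vstack([top, bottom])


# Toy data: 3 formal and 2 informal workers, k = 2 regressors each.
X_formal = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
X_informal = np.array([[7.0, 8.0], [9.0, 10.0]])
imr_formal = np.array([0.1, 0.2, 0.3])
imr_informal = np.array([0.4, 0.5])

design = build_block_design(X_formal, X_informal, imr_formal, imr_informal)
print(design.shape)  # rows: n_F + n_I, columns: 2k + 1
```

Because each regressor appears once per group, a single penalized regression on this matrix yields a separate coefficient for formal and informal workers.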

The Sparse Matrix: Interpreting the Results

For a variable \(j\):

| \(x_{j}^{F}\) selected (\(\hat{\beta}_{j}^{F} \neq 0\)) | \(x_{j}^{I}\) selected (\(\hat{\beta}_{j}^{I} \neq 0\)) | Interpretation |
|---|---|---|
| Yes | No | Variable \(x_{j}\) only affects the wage of formal workers |
| No | Yes | Variable \(x_{j}\) only affects the wage of informal workers |
| Yes | Yes | Variable \(x_{j}\) affects the wage of formal and informal workers (wage gap) |
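The interpretation rule above can be written as a small classification function. This is a sketch only: the variable names and coefficient values below are hypothetical, and the labels are shorthand for the three interpretations.

```python
from collections import Counter


def classify_variable(beta_formal: float, beta_informal: float) -> str:
    """Label a variable by which group-specific coefficients are non-zero."""
    formal_selected = beta_formal != 0.0
    informal_selected = beta_informal != 0.0
    if formal_selected and informal_selected:
        return "wage gap"          # affects both groups
    if formal_selected:
        return "formal only"       # affects only formal workers' wages
    if informal_selected:
        return "informal only"     # affects only informal workers' wages
    return "not selected"


# Hypothetical (beta_F, beta_I) pairs for illustration.
coefficients = {
    "age": (0.3, 0.0),
    "schooling": (0.0, -0.2),
    "firm_size": (0.5, 0.4),
    "region": (0.0, 0.0),
}

labels = {name: classify_variable(bf, bi) for name, (bf, bi) in coefficients.items()}
counts = Counter(labels.values())
```

Tallying the labels in this way gives the kind of counts reported for each category of selected controls.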

Results: Controls selected

| Explain wage of | Linear | Nonlinear | Total |
|---|---|---|---|
| Formal workers (\(\hat{\beta}_{j}^{F} \neq 0 \quad \wedge \quad \hat{\beta}_{j}^{I} = 0\)) | 3 | 254 | 257 |
| Informal workers (\(\hat{\beta}_{j}^{F} = 0 \quad \wedge \quad \hat{\beta}_{j}^{I} \neq 0\)) | 2 | 383 | 385 |
| Formal and informal workers (\(\hat{\beta}_{j}^{F} \neq 0 \quad \wedge \quad \hat{\beta}_{j}^{I} \neq 0\)) | 2 | 183 | 185 |
| Total controls (\(\hat{\beta}_{j}^{F} \neq 0 \quad \vee \quad \hat{\beta}_{j}^{I} \neq 0\)) | 7 | 820 | 827 |
| IMR | 1 | | 1 |
| Total controls selected | 8 | 820 | 828 |

Notes: The table reports the number of controls selected by adaptive lasso from 2,612 potential controls (including quadratic terms and interactions). \(\hat{\beta}_{j}^{F}\) and \(\hat{\beta}_{j}^{I}\) represent the estimated coefficients for variable \(x_{j}\) for formal and informal workers, respectively. A variable \(x_{j}\) explains only the wage of formal workers when \(\hat{\beta}_{j}^{F} \neq 0 \; \wedge \; \hat{\beta}_{j}^{I}=0\), and vice versa. A variable \(x_{j}\) explains the wage gap when \(\hat{\beta}_{j}^{F} \neq 0 \; \wedge \; \hat{\beta}_{j}^{I} \neq 0\). Nonlinear variables refer to quadratic and interaction terms.

Increase wage gap:

  • Both receive wage premium: Dual nationality \(\times\) Illiteracy.

  • Both receive wage penalty: Foreigner \(\times\) Bisexual.

  • Opposite effects: Atacama Region \(\times\) Dual nationality.

Formal workers:

  • Increase hourly wage: Extraterritorial organizations \(\times\) Married.
  • Decrease hourly wage: Divorced \(\times\) Bisexual.

Informal workers:

  • Increase hourly wage: Transportation sector \(\times\) Cohab. with civil union.
  • Decrease hourly wage: Medium-sized enterprise \(\times\) Other sexual orientation.

Plot

Conclusions

Nonlinearity is relevant in the wage structure of both formal and informal workers.

Wage gap between formal and informal workers is mainly explained by nonlinearities.

Chapter 3: Wage structure and wage gap determinants between formal and informal workers by status in employment, Chile 2004-2015

What factors influence the wage structure of formal and informal workers, and what characteristics explain the wage gap between the two groups according to their type of employment?

Data

Data used: EPS 2004, EPS 2006, EPS 2009, and EPS 2015.

17.78% of people were in informal employment in Chile between 2004 and 2015.

Informal workers:

  • Unpaid family workers (2.05%).
  • Salaried workers (8.49%).
  • Employers (27.39%).
  • Self-employed workers (62.07%).

We use 38 variables (individual and job characteristics) to build a second-degree polynomial expansion (quadratic terms and interactions).

Methodology: OLS POST-LASSO (Belloni et al. 2016)

\[ \begin{equation*} y_{it}=\boldsymbol{x}_{it}^{\intercal}\boldsymbol{\beta}+\hat{\lambda}_{it}+d2_{t}\hat{\lambda}_{it}+\cdots+dT_{t}\hat{\lambda}_{it} + \mu_{i}+\varepsilon_{it} \end{equation*} \]

where,

  • \(\hat{\lambda}_{it}\) is the Inverse Mills Ratio for individual \(i\) at time \(t\).
  • \(\boldsymbol{x}_{it}\) is a vector of characteristics for individual \(i\) at time \(t\), taken from the Sparse Matrix.
  • \(d2_{t}\) to \(dT_{t}\) are time dummy variables.
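The post-lasso step can be sketched as an ordinary least squares refit on the columns that the lasso selected, augmented with the Inverse Mills Ratio and its interactions with the time dummies. The sketch below is a simplification under stated assumptions: it omits the individual effects \(\mu_{i}\), takes the selected column indices as given, and uses synthetic data. The function names are hypothetical, and the period labels simply reuse the four survey years.

```python
import numpy as np


def imr_time_interactions(imr, period, periods):
    """Columns: IMR, then IMR * 1[period == p] for each period after the first."""
    cols = [imr]
    for p in periods[1:]:
        cols.append(imr * (period == p))
    return np.column_stack(cols)


def post_lasso_ols(X, y, selected, extra=None):
    """Refit by OLS using only the lasso-selected columns (plus extra terms)."""
    Z = X[:, selected]
    if extra is not None:
        Z = np.hstack([Z, extra])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef


# Synthetic, noise-free panel-like data for illustration.
rng = np.random.default_rng(1)
n = 300
periods = [2004, 2006, 2009, 2015]
X = rng.standard_normal((n, 4))
period = rng.choice(periods, size=n)
imr = rng.uniform(0.1, 1.0, size=n)
y = 1.0 * X[:, 0] - 0.5 * X[:, 2] + 0.3 * imr + 0.2 * imr * (period == 2006)

selected = [0, 2]  # assumed output of a first-stage lasso
extra = imr_time_interactions(imr, period, periods)
coef = post_lasso_ols(X, y, selected, extra)
print(np.round(coef, 3))
```

Refitting without the penalty removes the shrinkage bias that the lasso imposes on the retained coefficients, which is the motivation for a post-lasso estimator.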

Number of controls selected and estimated by OLS Post-LASSO

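| | Employees: Linear | Employees: Nonlinear | Employees: Total | Self-employed: Linear | Self-employed: Nonlinear | Self-employed: Total | Employers: Linear | Employers: Nonlinear | Employers: Total |
|---|---|---|---|---|---|---|---|---|---|
| Wage Structure Formal Worker (\(\hat{\beta}_{j}^{F} \neq 0 \; \wedge \; \hat{\beta}_{j}^{I} = 0\)) | 5 | 0 | 5 | 0 | 0 | 0 | 36 | 1 | 37 |
| Wage Structure Informal Worker (\(\hat{\beta}_{j}^{F} = 0 \; \wedge \; \hat{\beta}_{j}^{I} \neq 0\)) | 1 | 0 | 1 | 0 | 0 | 0 | 64 | 1 | 65 |
| Wage gap (\(\hat{\beta}_{j}^{F} \neq 0 \; \wedge \; \hat{\beta}_{j}^{I} \neq 0\)) | 1 | 1 | 2 | 1 | 1 | 2 | 55 | 2 | 57 |
| Total controls | 7 | 1 | 8 | 1 | 1 | 2 | 155 | 4 | 159 |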

Notes: The table reports the number of controls selected by OLS Post-LASSO by type of job, from 547 potential controls (including quadratic terms and interactions). \(\hat{\beta}_{j}^{F}\) and \(\hat{\beta}_{j}^{I}\) represent the estimated coefficients for variable \(x_{j}\) for formal and informal workers, respectively. A variable \(x_{j}\) affects only formal workers when \(\hat{\beta}_{j}^{F} \neq 0\; \wedge \;\hat{\beta}_{j}^{I} = 0\), and vice versa. A variable \(x_{j}\) affects the wage gap between formal and informal workers when \(\hat{\beta}_{j}^{F} \neq 0\; \wedge \;\hat{\beta}_{j}^{I} \neq 0\). Nonlinear variables refer to quadratic and interaction terms.

Employers

Increase wage gap:

  • Both receive wage premium: Male \(\times\) Head of household.

  • Both receive wage penalty: High school \(\times\) Araucania Region.

  • Opposite effects: Separated \(\times\) Santiago Metropolitan Region.

Formal workers:

  • Increase hourly wage: Separated \(\times\) Los Lagos Region.
  • Decrease hourly wage: Associate Degree \(\times\) Valparaiso Region.

Informal workers:

  • Increase hourly wage: Tarapaca Region \(\times\) Seasonal work.
  • Decrease hourly wage: Married \(\times\) Araucania Region.

Conclusions

Nonlinear effects are relevant in the log wage generation process for employers, self-employed, and salaried workers.

Public policies should take into account the heterogeneity of the labor market and consider the individual and job characteristics that affect income generation.

References

Amuedo-Dorantes, Catalina. 2004. “Determinants and Poverty Implications of Informal Sector Work in Chile.” Economic Development and Cultural Change 52 (2): 347–68. https://doi.org/10.1086/380926.
Angel-Urdinola, Diego F, Kimie Tanabe, Caroline Freund, Mohamad Allouche, Anne Hilger, Norman Loayza, William Maloney, Jaime Saenz, Joana Silva, and May Wazzan. 2012. “Micro-Determinants of Informal Employment in The Middle East and North Africa Region.” www.worldbank.org/sp.
Balcar, Jiří. 2012. “Supply Side Wage Determinants: Overview of Empirical Literature.” Review of Economic Perspectives 12 (January): 207–22. https://doi.org/10.2478/v10135-012-0010-x.
Bargain, Olivier, and Prudence Kwenda. 2011. “Earnings Structures, Informal Employment, And Self-Employment: New Evidence From Brazil, Mexico, And South Africa.” Review of Income and Wealth 57 (SUPPL. 1). https://doi.org/10.1111/j.1475-4991.2011.00454.x.
Bargain, Olivier, and Prudence Kwenda. 2014. “The Informal Sector Wage Gap: New Evidence Using Quantile Estimations on Panel Data.” Economic Development and Cultural Change 63 (October): 117–53. https://doi.org/10.1086/677908.
Belloni, Alexandre, Victor Chernozhukov, Christian Hansen, and Damian Kozbur. 2016. “Inference in High-Dimensional Panel Models with an Application to Gun Control.” Journal of Business & Economic Statistics 34: 590–605.
Chen, Guifu, and Shigeyuki Hamori. 2013. “Formal and Informal Employment and Income Differentials in Urban China.” Journal of International Development 25 (October): 987–1004. https://doi.org/10.1002/jid.1825.
Cho, Joonmo, and Donghun Cho. 2011. “Gender Difference of the Informal Sector Wage Gap: A Longitudinal Analysis for the Korean Labor Market.” Journal of the Asia Pacific Economy 16: 612–29. https://doi.org/10.1080/13547860.2011.621363.
Falco, Paolo, Andrew Kerr, Neil Rankin, Justin Sandefur, and Francis Teal. 2011. “The Returns to Formality and Informality in Urban Africa.” Labour Economics 18 (December): S23–31. https://doi.org/10.1016/j.labeco.2011.09.002.
Gunter, Samara R. 2017. “Dynamics of Urban Informal Labor Supply in the United States.” Social Science Quarterly 98 (March): 16–36. https://doi.org/10.1111/SSQU.12284.
Kumar, Manik, and Sweety Pandey. 2021. “Wage Gap Between Formal and Informal Regular Workers in India: Evidence from the National Sample Survey.” Global Journal of Emerging Market Economies 13 (January): 104–21. https://doi.org/10.1177/0974910121989458.
Maurizio, Roxana. 2012. “Labour Informality in Latin America: The Case of Argentina, Chile, Brazil and Peru.” Brooks World Poverty Institute Working Paper. http://ssrn.com/abstract=2062337.
Meghir, Costas, Renata Narita, and Jean-Marc Robin. 2015. “Wages and Informality in Developing Countries.” The American Economic Review 105: 1509–46. http://www.jstor.org/stable/43495426.
Nordman, Christophe J., Faly Rakotomanana, and François Roubaud. 2016. “Informal Versus Formal: A Panel Data Analysis of Earnings Gaps in Madagascar.” World Development 86 (October): 1–17. https://doi.org/10.1016/j.worlddev.2016.05.006.
Ramoni-Perazzi, Josefa, and Giampaolo Orlandoni-Merli. 2021. “Analysis of the Formal/Informal Wage Inequalities in Colombia: A Semiparametric Approach.” Journal of Applied Social Science 15 (1): 107–31. https://doi.org/10.1177/1936724420975343.
Rosser, J. Barkley, Marina V. Rosser, and Ehsan Ahmed. 2000. “Income Inequality and the Informal Economy in Transition Economies.” Journal of Comparative Economics 28 (March): 156–71. https://doi.org/10.1006/JCEC.2000.1645.
Tansel, Aysit, Halil Ibrahim Keskin, and Zeynel Abidin Ozdemir. 2020. “Is There an Informal Employment Wage Penalty in Egypt? Evidence from Quantile Regression on Panel Data.” Empirical Economics 58 (June): 2949–79. https://doi.org/10.1007/S00181-019-01651-2.
Williams, Colin C., Muhammad S. Shahid, and Alvaro Martínez. 2016. “Determinants of the Level of Informality of Informal Micro-Enterprises: Some Evidence from the City of Lahore, Pakistan.” World Development 84 (August): 312–25. https://doi.org/10.1016/J.WORLDDEV.2015.09.003.
Zou, Hui. 2006. “The Adaptive Lasso and Its Oracle Properties.” Journal of the American Statistical Association 101 (December): 1418–29. https://doi.org/10.1198/016214506000000735.